[WIP] Stream during snapshot #409
Background
Currently, the replication process is effectively linear / "single threaded". When a new sync rules version is deployed, we create a new replication stream, which performs a snapshot on each table, then starts streaming. This has a couple of limitations:
The changes here are also part of the bigger project to implement differential sync rule updates - only re-replicating for changed bucket definitions / sync stream definitions. Part of that requires switching to a single replication stream for all copies of sync rule versions, and this builds the base to implement that.
Changes to storage implementation
Firstly, this changes the underlying BucketStorageBatch implementations to be safe under concurrent usage. Right now it is still designed to only have one process doing streaming replication with commits at a time, but it is safe to have multiple other processes doing snapshots concurrently.
To implement this, we reduce the reliance on local state, instead using the database state which is safe for concurrent access. While this does not introduce any new model fields yet, we do now rely more strongly on snapshot_done (keeps track of whether or not we're waiting for any snapshots to complete) and keepalive_op (keeps track of ops persisted but not committed yet). This has the specific implication that new slots always need an explicit markAllSnapshotDone() call - this is not done automatically anymore.
This also removes the implementation difference between keepalive(lsn) and commit(lsn) - these now both do the same thing.
Changes to replication
Starting with Postgres, we now start streaming changes immediately when starting replication, even if a snapshot is required. To avoid consistency issues, we:
This also splits out the snapshot implementation from the streaming replication implementation. The snapshotter keeps a queue of tables to snapshot. Currently it only snapshots one at a time, but we can change this in the future.
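As a rough illustration of the snapshotter described above, here is a minimal in-memory sketch of a queue that snapshots one table at a time and makes the explicit markAllSnapshotDone() call once the queue drains. Only markAllSnapshotDone() is named in this PR; the SnapshotStorage interface, the InMemoryStorage and SnapshotQueue classes, the step() method and the callback signature are all hypothetical and are not taken from the actual implementation. Real snapshots would be asynchronous and backed by the database. The sketch is synchronous only to keep it short.

```typescript
// Hypothetical storage interface. Only the method name markAllSnapshotDone()
// comes from the PR description; the signature is an assumption.
interface SnapshotStorage {
  markAllSnapshotDone(): void;
}

// In-memory stand-in for the snapshot_done flag kept in database state.
class InMemoryStorage implements SnapshotStorage {
  snapshotDone = false;

  markAllSnapshotDone(): void {
    this.snapshotDone = true;
  }
}

// Hypothetical queue of tables awaiting a snapshot. It processes one table
// per step(), mirroring the "one at a time" behaviour described above.
class SnapshotQueue {
  private pending: string[] = [];
  private storage: SnapshotStorage;
  private snapshotTable: (table: string) => void;

  constructor(storage: SnapshotStorage, snapshotTable: (table: string) => void) {
    this.storage = storage;
    this.snapshotTable = snapshotTable;
  }

  // Called by the streaming side when it encounters a table that needs a snapshot.
  // Tables that are already waiting in the queue are not added twice.
  enqueue(table: string): void {
    if (this.pending.indexOf(table) === -1) {
      this.pending.push(table);
    }
  }

  // Snapshot the next queued table. Returns false when there was nothing to do.
  step(): boolean {
    const table = this.pending.shift();
    if (table === undefined) {
      return false;
    }
    this.snapshotTable(table);
    if (this.pending.length === 0) {
      // Explicit call: nothing marks the snapshots as done automatically.
      this.storage.markAllSnapshotDone();
    }
    return true;
  }
}

// Example usage: streaming would keep running while step() is called repeatedly.
const storage = new InMemoryStorage();
const queue = new SnapshotQueue(storage, (table) => {
  // A real implementation would copy the table's rows into bucket storage here.
});
queue.enqueue("users");
queue.enqueue("orders");
while (queue.step()) {
  // Each iteration snapshots exactly one table.
}
```

Keeping the queue separate from the streaming loop is what would allow several tables to be processed in parallel later without touching the streaming code.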
Tasks
createEmptyCheckpoints logic.